Optimal feature selection for sparse linear discriminant analysis and its applications in gene expression data

نویسندگان

  • Cheng Wang
  • Longbing Cao
  • Baiqi Miao
چکیده

This work studies the theoretical rules of feature selection in linear discriminant analysis (LDA) and a new feature selection method is proposed for sparse linear discriminant analysis. A l1 minimization method is used to select the important features from which LDA will be constructed. The asymptotic results of this proposed Two-stage LDA (TLDA) are studied, demonstrating that TLDA is an optimal classification rule whose convergence rate is the best compared to existing methods. The experiments on simulated and real data sets are consistent with the theoretical results and show that TLDA performs favorably in comparison with current methods. Overall, TLDA uses a lower minimum number of features or genes than other approaches to achieve a better result with a reduced misclassification rate.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method

Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...

متن کامل

Multi-Dimensional Feature Scoring For Gene Expression Data

Motivation: The analysis of gene expression data presents researchers with the problem of finding optimal subsets of genes to focus on. This is a computational and statistical challenge, mostly due to the high-dimensionality of the data and the small amounts of samples. Hence, an initial process of gene (feature) selection is usually performed. Results: This paper discusses several methods that...

متن کامل

A Multi Linear Discriminant Analysis Method Using a Subtraction Criteria

Linear dimension reduction has been used in different application such as image processing and pattern recognition. All these data folds the original data to vectors and project them to an small dimensions. But in some applications such we may face with data that are not vectors such as image data. Folding the multidimensional data to vectors causes curse of dimensionality and mixed the differe...

متن کامل

Supervised Feature Extraction of Face Images for Improvement of Recognition Accuracy

Dimensionality reduction methods transform or select a low dimensional feature space to efficiently represent the original high dimensional feature space of data. Feature reduction techniques are an important step in many pattern recognition problems in different fields especially in analyzing of high dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...

متن کامل

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

We can reach by DNA microarray gene expression to such wealth of information with thousands of variables (genes). Analysis of this information can show genetic reasons of disease and tumor differences. In this study we try to reduce high-dimensional data by statistical method to select valuable genes with high impact as biomarkers and then classify ovarian tumor based on gene expression data of...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Statistics & Data Analysis

دوره 66  شماره 

صفحات  -

تاریخ انتشار 2013